Scalable Validation of Data Streams

Author

  • Cheng Xu
Abstract

Xu, C. 2016. Scalable Validation of Data Streams. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1384. 51 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-554-9600-5.

In manufacturing industries, sensors installed on industrial equipment often generate high volumes of data in real time. To shorten machine downtime and reduce maintenance costs, it is critical to analyze such streams efficiently in order to detect abnormal equipment behavior. For validating data streams to detect anomalies, a data stream management system called SVALI is developed. Based on the requirements of the application domain, different stream window semantics are explored and an extensible set of window forming functions is implemented, where dynamic registration of window aggregations allows incremental evaluation of aggregate functions over windows. To facilitate stream validation at a high level, the system provides two second-order system validation functions, model-and-validate and learn-and-validate. Model-and-validate allows the user to define mathematical models based on physical properties of the monitored equipment, while learn-and-validate builds statistical models by sampling the stream in real time as it flows. To validate geographically distributed equipment with short response time, SVALI is a distributed system where many SVALI instances can be started and run in parallel on board the equipment. Central analyses are made at a monitoring center where streams of detected anomalies are combined and analyzed on a cluster computer. SVALI is an extensible system where functions can be implemented using external libraries written in C, Java, and Python without any modification of the original code. The system and the developed functionality have been applied in several applications, both industrial and in sports analytics.
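The two ideas of incremental window aggregation and learn-and-validate can be illustrated with a minimal Python sketch. This is not SVALI's API or query language: the class `SlidingWindowStats`, the function `learn_and_validate`, the count-based window, and the k-standard-deviations rule are all assumptions made for illustration. The sketch keeps a running sum and sum of squares so the window statistics are updated in constant time per reading, and it flags readings that deviate strongly from the statistics learned from recent normal readings.

```python
from collections import deque
from math import sqrt


class SlidingWindowStats:
    """Count-based sliding window (hypothetical helper, not part of SVALI).

    The sum and sum of squares are updated incrementally, so mean and
    standard deviation never require rescanning the whole window.
    """

    def __init__(self, size):
        self.size = size
        self.values = deque()
        self.total = 0.0
        self.total_sq = 0.0

    def add(self, x):
        # Add the new reading, then evict the oldest one if the window overflows.
        self.values.append(x)
        self.total += x
        self.total_sq += x * x
        if len(self.values) > self.size:
            old = self.values.popleft()
            self.total -= old
            self.total_sq -= old * old

    def full(self):
        return len(self.values) == self.size

    def mean(self):
        return self.total / len(self.values)

    def std(self):
        # Population standard deviation; clamp tiny negative rounding errors to 0.
        n = len(self.values)
        variance = max(self.total_sq / n - self.mean() ** 2, 0.0)
        return sqrt(variance)


def learn_and_validate(stream, window_size=5, k=3.0):
    """Yield (index, value) for readings more than k standard deviations
    away from the mean learned over the most recent accepted readings.

    Assumption: flagged readings are excluded from the learned model.
    """
    stats = SlidingWindowStats(window_size)
    for i, x in enumerate(stream):
        if stats.full() and abs(x - stats.mean()) > k * stats.std():
            yield (i, x)
        else:
            stats.add(x)


if __name__ == "__main__":
    readings = [10.0, 10.1, 9.9, 10.0, 10.1, 25.0, 10.0]
    for index, value in learn_and_validate(readings):
        print(f"anomaly at position {index}: {value}")
```

Excluding flagged readings from the window is one possible design choice; it keeps a single outlier from inflating the learned variance, at the cost of adapting more slowly if the equipment's normal behavior genuinely shifts.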


Similar resources

Collaborative Filtering in Dynamic Streaming Environments

The increasing expansion of websites and their web usage necessitates increasingly scalable techniques for Web usage mining that can be better cast within the framework of mining evolving data streams [1, 5]. Despite recent developments in mining evolving Web clickstreams [3, 6], there has not been any investigation of the performance of collaborative filtering [2] in the demanding environment ...


Sensors Data-Stream Processing Middleware based on Multi-Agent Model

The goal of this study is to propose an architecture for an intelligent sensor data processing middleware. In order to fulfill the ambient assisted living data processing requirements we design a flexible and scalable architecture based on multi-agent model. This architecture allows acquisition, interpretation and aggregation of sensor data-streams. Our system is able to process different senso...


LDA Experimental Data of Three-Poster Jet Impingement System

During its near-ground hovering phase a Short Take-Off and Vertical Landing (STOVL) aircraft creates a complex three-dimensional flow field between jet streams, the airframe surface and the ground. A proper understanding and numerical prediction of this flow is important in the design of such aircraft. In this paper an experimental facility, used to gather validation data suitable for testing C...


Scalable Robust Monitoring of Large-Scale Data Streams

Online monitoring large-scale data streams has many important applications such as industrial quality control, signal detection, biosurveillance, but unfortunately it is highly non-trivial to develop scalable schemes that are able to tackle two issues of robustness concerns: (1) the unknown sparse number or subset of affected data streams and (2) the uncertainty of model specification for high-...


Corrections to “LD-Sketch: A Distributed Sketching Design for Accurate and Scalable Anomaly Detection in Network Data Streams”

In this article, we describe the corrections to our paper “LD-Sketch: A Distributed Sketching Design for Accurate and Scalable Anomaly Detection in Network Data Streams” published at IEEE INFOCOM 2014. We also clarify the complexity issue raised by some readers.



Publication date: 2016